Abstract

This paper presents a formalized framework for defining corecursive
functions safely in a total setting, based on corecursion up-to
and relational parametricity. The end product is a general corecursor
that allows corecursive (and even recursive) calls under well-behaved
operations, including constructors. Corecursive functions
that are well behaved can be registered as such, thereby increasing
the corecursor’s expressiveness. The metatheory is formalized in
the Isabelle proof assistant and forms the core of a prototype tool.
The corecursor is derived from first principles, without requiring
new axioms or extensions of the logic.

Categories and Subject Descriptors F.3.1 [Logics and Meanings
of Programs]: Specifying and Verifying and Reasoning about
Programs – Mechanical verification; F.4.1 [Mathematical Logic
and Formal Languages]: Mathematical Logic – Mechanical theorem
proving, Model theory

General Terms Algorithms, Theory, Verification 

Keywords (Co)recursion, parametricity, proof assistants, 
higher-order logic, Isabelle 

1. Introduction 

Total functional programming is a discipline that ensures computations
always terminate. It is invaluable in a proof assistant, where
nonterminating definitions such as f x = f x + 1 can yield contradictions.
Hence, most assistants will accept recursive functions only if
they can be shown to terminate. Similar concerns arise in specification
languages and verifying compilers.

However, some processes need to run forever without being
inconsistent. An important class of total programs has been
identified under the heading of productive coprogramming:
These are functions that progressively reveal parts of their
(potentially infinite) output. For example, given a type of infinite
streams constructed by SCons, the definition

natsFrom n = SCons n (natsFrom (n + 1))




falls within this fragment, since each call to natsFrom produces 
one constructor before entering the nested call. Not only is the 
equation consistent, it also fully specifies the function’s behavior. 

The above definition is legitimate only if objects are allowed
to be infinite. This may be self-evident in a nonstrict functional
language such as Haskell, but in a total setting we must carefully
distinguish between the well-founded inductive (or algebraic)
datatypes and the non-well-founded coinductive (or coalgebraic)
datatypes, often simply called datatypes and codatatypes, respectively.
Recursive functions consume datatype values, peeling off a
finite number of constructors as they proceed; corecursive functions
produce codatatype values, consisting of finitely or infinitely
many constructors. And in the same way that induction is available
as a proof principle to reason about datatypes and terminating recursive
functions, coinduction supports reasoning over codatatypes
and productive corecursive functions.

Despite their reputation for esotericism, codatatypes have an
important role to play in both the theory and the metatheory of
programming. On the theory side, they allow a direct embedding
of a large class of nonstrict functional programs in a total logic.
In conjunction with interactive proofs and code generators, this
enables certified functional programming. On the metatheory
side, codatatypes conveniently capture infinite, possibly branching
processes. Major proof developments rely on them, including those
associated with a C compiler [33], a Java compiler, and the
Java memory model.

Codatatypes are supported by an increasing number of proof 
assistants, including Agda, Coq, Isabelle/HOL, Isabelle/ZF, Matita, 
and PVS. They are also present in the CoALP dialect of logic 
programming and in the Dafny specification language. But the 
ability to introduce codatatypes is not worth much without adequate 
support for defining meaningful functions that operate on them. For 
most systems, this support can be characterized as work in progress. 
The key question they all must answer is: What right-hand sides 
can be safely allowed in a function definition? 

Generally, there are two main approaches to support recursive 
and corecursive functions in a proof assistant or similar system: 

The intrinsic approach: A syntactic criterion is built into the 
logic: termination for recursive specifications, productivity (or 
guardedness) for corecursive specifications. The termination or 
productivity checker is part of the system’s trusted code base. 

The foundational approach: The (co)recursive specifications are
reduced to a fixpoint construction, which permits a simple definition
of the form f = ..., where f does not occur in the right-hand
side. The original equations are derived as theorems from
this internal definition, using dedicated proof tactics.

Systems favoring the intrinsic approach include the proof assistants
Agda and Coq, as well as tools such as CoALP and Dafny. The
main hurdle for their users is that syntactic criteria are inflexible;
the specification must be massaged so that it falls within a given
syntactic fragment, even though the desired property (termination
or productivity) is semantic. But perhaps more troubling, in systems
that process theorems, soundness is not obvious at all and very
tedious to ensure; as a result, there is a history of critical bugs
in termination and productivity checkers, as we will see when we
review related work. Indeed, Abel [2] observed that

Maybe the time is ripe to switch to a more semantical notion
of termination and guardedness. The syntactic guard condition
gets you somewhere, but then needs a lot of extensions
and patching to work satisfactory in practice. Formal verification
of it becomes too difficult, and only intuitive justification
is prone to errors.

In contrast to Agda and Coq, proof assistants based on higher-order
logic (HOL), such as HOL4, HOL Light, and Isabelle/HOL,
generally adhere to the foundational approach. Their logic is expressive
enough to accommodate the (co)algebraic constructions
underlying (co)datatypes and (co)recursive functions in terms of
functors on the category of sets [49]. The main drawback of this
approach is that it requires a lot of work, both conceptual and
implementational. Moreover, it is not available for all systems, since
it requires an expressive enough logic.

Because every step must be formally justified, foundational 
definitional principles tend to be simpler and more restrictive than 
their intrinsic counterparts. As a telling example, codatatypes were 
introduced in Isabelle/HOL only recently, almost two decades after 
their inception in Coq, and they are still missing from the other 
HOL systems; and corecursion is limited to the primitive case, in 
which corecursive calls occur under exactly one constructor. 

That primitive corecursion (or the slightly extended version
supported by Coq) is too restrictive is an observation that has been
made repeatedly by researchers who use corecursion in Coq and
now also Isabelle. Lochbihler and Hölzl dedicated a paper [36] to
ad hoc techniques for defining operations on corecursive lists in
Isabelle. Only after introducing a lot of machinery do they manage
to define their central example (lfilter, a filter function on lazy,
coinductive lists) and derive suitable reasoning principles.

We contend that it is possible to combine advanced features 
as found in Agda and Coq with the fundamentalism of Isabelle. 
The lack of built-in support for corecursion, an apparent weakness, 
reveals itself as a strength as we proceed to introduce rich notions of 
corecursion, without extending the type system or adding axioms. 

In this paper, we formalize a highly expressive corecursion
framework that extends primitive corecursion in the following
ways: It allows corecursive calls under several constructors; it allows
well-behaved operators in the context around or between the
constructors and around the corecursive calls; importantly, it supports
blending terminating recursive calls with guarded corecursive
calls. This general corecursor is accompanied by a corresponding,
equally general coinduction principle that makes reasoning about
it convenient. Each of the corecursor, the mixed recursor-corecursor,
and the coinduction principle grows in expressiveness during the
interaction with the user, by learning of new well-behaved contexts.
The constructions draw heavily from category theory.

Before presenting the technical details, we first show through
examples how a primitive corecursor can be incrementally enriched
to accept ever richer notions of corecursive call context (Section 2).
This is made possible by the modular bookkeeping of additional
structure for the involved type constructors, including a relator
structure. This structure can be exploited to prove parametricity
theorems, which ensure the suitability of operators as participants
in the call contexts, in the style of coinduction up-to. Each new
corecursive definition is a potential future participant (Section 3).


This extensible corecursor gracefully handles codatatypes with
nesting through arbitrary type constructors (e.g., for infinite-depth
Rose trees nested through finite or infinite lists). Thanks to the
framework’s modularity, function specifications can combine corecursion
with recursion, yielding quite expressive mixed fixpoint
definitions. This is inspired by the Dafny tool, but our
approach is semantically founded and hence provably consistent.

The complete metatheory is implemented in Isabelle/HOL, as a
combination of a generic proof development parameterized by arbitrary
type constructors and a tool for instantiating the metatheory
to user-specified instances [13].

Techniques such as corecursion and coinduction up-to have
been known for years in the process algebra community, before
they were embraced and perfected by category theorists.
This work is part of a wider program aiming at bringing insight
from category theory into proof assistants [12, 49]. The main
contributions of this paper are the following:

• We represent in higher-order logic an integrated framework for 
recursion and corecursion able to evolve by user interaction. 

• We identify a sound fragment of mixed recursive-corecursive 
specifications, integrate it in our framework, and present several 
examples that motivate this feature. 

• We implement the above in Isabelle/HOL within an interactive 
loop that maintains the recursive-corecursive infrastructure. 

• We use this infrastructure to automatically derive many examples
that are problematic in other proof assistants.

A distinguishing feature of our framework is that it does not
require the user to provide type annotations. In the design space,
it lies between the highly restrictive primitive corecursion and the
more bureaucratic up-to approaches such as clock variables
and sized types [3], combining expressiveness and ease of use. The
identification of this “sweet spot” can also be seen as a contribution.

2. Motivating Examples 

We demonstrate the expressiveness of the corecursor framework
by examples, adopting the user’s perspective. The case studies by
Rutten [46] and Hinze on stream calculi serve as our starting
point. Streams of natural numbers can be defined as

codatatype Stream = SCons (head : Nat) (tail : Stream) 

where SCons : Nat → Stream → Stream is the constructor and
head : Stream → Nat, tail : Stream → Stream are its selectors.
Although the examples may seem simple or contrived, they were 
carefully chosen to show the main difficulties that arise in practice. 

2.1 Corecursion Up-to 

As our first example of a corecursive function definition, we con¬ 
sider the pointwise sum of two streams: 

xs ⊕ ys = SCons (head xs + head ys) (tail xs ⊕ tail ys)

The specification is productive, since the corecursive call occurs 
directly under the stream constructor, which acts as a guard (shown 
underlined). Moreover, it is primitively corecursive, because the 
topmost symbol on the right-hand side is a constructor and the 
corecursive call appears directly as an argument to it. 
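
As an informal illustration outside the total setting of the paper, the pointwise sum can be mimicked with lazy Python generators. This is only a sketch: the name plus and the use of itertools.count as a sample input are assumptions made for the example, and Python performs no productivity check.

```python
from itertools import count, islice

def plus(xs, ys):
    """Pointwise sum: each output element consumes one element of each input."""
    for x, y in zip(xs, ys):
        yield x + y          # one element produced per pair consumed

# Sample use: add the stream 0, 1, 2, ... to itself.
evens = plus(count(0), count(0))
print(list(islice(evens, 5)))   # [0, 2, 4, 6, 8]
```

Each step reads exactly one element from each argument before emitting one element, which is the behavior that the guard in the specification enforces.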

These syntactic restrictions can be relaxed to allow conditional
statements and ‘let’ expressions, but despite such tricks primitive
corecursion remains hopelessly primitive. The syntactic restriction
for admissible corecursive definitions in Coq is more permissive
in that it allows for an arbitrary number of constructors to
guard the corecursive calls, as in the following definition:

onetwos = SCons 1 (SCons 2 onetwos)

Our framework achieves the same result by registering SCons as
a well-behaved operation. Intuitively, an operation is well behaved
if it needs to destruct at most one constructor of input to produce
one constructor of output. For streams, such an operation may inspect
the head and the tail (but not the tail’s tail) of its arguments
before producing an SCons. Because the operation preserves productivity,
it can safely surround the guarding constructor.

The rigorous definition of well-behavedness will capture this 
intuition in a parametricity property that must be discharged by the 
user. In exchange, the framework yields a strengthened corecursor 
that incorporates the new operation. 

The constructor SCons is well behaved, since it does not even 
need to inspect its arguments to produce a constructor. In contrast, 
the selector tail is not well behaved—it must destruct two layers of 
constructors to produce one: 

tail xs = SCons (head (tail xs)) (tail (tail xs))

The presence of non-well-behaved operations in the corecursive 
call context is enough to break productivity, as in the example 
stallA = SCons 1 (tail stallA), which stalls immediately after producing
one constructor, leaving tail stallA unspecified.

Another instructive example is the function that keeps every 
other element in a stream: 

everyOther xs = SCons (head xs) (everyOther (tail (tail xs)))

The function is not well behaved, despite being primitively corecursive.
It also breaks productivity: stallB = SCons 1 (everyOther
stallB) stalls after producing two constructors.

Going back to our first example, we observe that the operation ⊕
is well behaved. Hence, it is allowed to participate in corecursive 
call contexts when defining new functions. In this respect, the 
framework is more permissive than Coq’s syntactic restriction. For 
example, we can define the stream of Fibonacci numbers in either 
of the following two ways: 

fibA = SCons 0 (SCons 1 fibA ⊕ fibA)

fibB = SCons 0 (SCons 1 fibB) ⊕ SCons 0 fibB

Well-behaved operations are allowed to appear both under the constructor
guard (as in fibA) and around it (as in fibB). Notice that two
guards are necessary in the second example, one for each branch
of the ⊕ operator. Incidentally, we are not aware of any other framework
that allows such definitions. Without rephrasing the specification,
fibB cannot be expressed in Rutten’s format of behavioral
differential equations or in Hinze’s syntactic restriction [22],
nor via Agda copatterns.
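
The two Fibonacci specifications can be tried out informally with lazy generators. The following is a minimal Python sketch, not the framework's construction: scons, plus, take, fib_a, and fib_b are names chosen for this example, and the code recomputes shared substreams, so it is only suitable for short prefixes.

```python
from itertools import chain, islice

def scons(x, xs):
    """SCons: put x in front of the lazy stream xs."""
    return chain([x], xs)

def plus(xs, ys):
    """Pointwise sum of two lazy streams."""
    for x, y in zip(xs, ys):
        yield x + y

def fib_a():
    # fibA = SCons 0 (SCons 1 fibA + fibA): the sum sits under the guard.
    yield 0
    yield from plus(scons(1, fib_a()), fib_a())

def fib_b():
    # fibB = SCons 0 (SCons 1 fibB) + SCons 0 fibB: the sum surrounds two guards.
    yield from plus(scons(0, scons(1, fib_b())), scons(0, fib_b()))

def take(k, stream):
    return list(islice(stream, k))

print(take(10, fib_a()))                    # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(take(10, fib_a()) == take(10, fib_b()))  # True
```

Both versions stay lazy because calling a generator function only creates the generator; no element of the inner call is requested before the outer call has emitted its own head.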

Many useful operations are well behaved and can therefore participate
in further definitions. Following Rutten, the shuffle product
⊗ of two streams is defined in terms of ⊕. Shuffle product being
itself well behaved, we can employ it to define stream exponentiation,
which also turns out to be well behaved:

xs ⊗ ys = SCons (head xs × head ys)
                ((xs ⊗ tail ys) ⊕ (tail xs ⊗ ys))

exp xs = SCons (2 ^ head xs) (tail xs ⊗ exp xs)

Next, we use the defined and registered operations to specify two 
streams of factorials of natural numbers facA (starting at 1) and 
facB (starting at 0): 

facA = SCons 1 facA ⊗ SCons 1 facA

facB = exp (SCons 0 facB) 

Computing the first few terms of facA manually should convince 
the reader that productivity and efficiency are not synonymous. 
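
To experiment with these equations, one can model streams as lazy cells with a memoized tail. The sketch below is an illustration in Python under that assumption; the Stream class and the helper names (take, plus, shuffle, exp, fac_a, fac_b) are hypothetical and not part of the formalization. The shuffle product is taken exactly as specified, so the cost grows exponentially with the prefix length.

```python
class Stream:
    """A lazy stream cell: a head plus a memoized thunk for the tail."""
    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._tail = None

    @property
    def tail(self):
        if self._tail is None:          # force the tail at most once
            self._tail = self._thunk()
        return self._tail

def take(k, s):
    out = []
    for _ in range(k):
        out.append(s.head)
        s = s.tail
    return out

def plus(xs, ys):       # pointwise sum
    return Stream(xs.head + ys.head, lambda: plus(xs.tail, ys.tail))

def shuffle(xs, ys):    # shuffle product, defined in terms of plus
    return Stream(xs.head * ys.head,
                  lambda: plus(shuffle(xs, ys.tail), shuffle(xs.tail, ys)))

def exp(xs):            # exp xs = SCons (2 ^ head xs) (tail xs (x) exp xs)
    e = Stream(2 ** xs.head, lambda: shuffle(xs.tail, e))
    return e

# facA = SCons 1 facA (x) SCons 1 facA
fac_a = shuffle(Stream(1, lambda: fac_a), Stream(1, lambda: fac_a))
# facB = exp (SCons 0 facB)
fac_b = exp(Stream(0, lambda: fac_b))

print(take(5, fac_a))   # [1, 2, 6, 24, 120]
print(take(5, fac_b))   # [1, 1, 2, 6, 24]
```

The self-references inside the lambdas are only resolved when a tail is forced, by which time fac_a and fac_b are bound; this plays the role of the constructor guard in the specifications.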

The arguments of well-behaved operations are not restricted to 
the Stream type. For example, we can define the well-behaved 
supremum of a finite set of streams by primitive corecursion: 


sup X = SCons (⨆ (fimage head X)) (sup (fimage tail X))

Here, fimage gives the image of a finite set under a function, and
⨆ X is the maximum of a finite set X of naturals, or 0 if X is empty.

2.2 Nested Corecursion Up-to 

Although we use streams as our main example, the framework
generally supports arbitrary codatatypes with multiple curried constructors
and nesting through other type constructors. To demonstrate
this last feature, we introduce the type of finitely branching
Rose trees of potentially infinite depth with numeric labels:

codatatype Tree = Node (val : Nat) (sub : List Tree) 

The type Tree has a single constructor Node : Nat → List Tree →
Tree and two selectors val : Tree → Nat and sub : Tree → List Tree.
The recursive occurrence of Tree is nested in the familiar polymorphic
datatype of finite lists.

We first define the pointwise sum of two trees analogously to ⊕:

t ⊞ u = Node (val t + val u)
             (map (λ(t', u'). t' ⊞ u') (zip (sub t) (sub u)))

Here, map is the standard map function on lists, and zip converts 
two parallel lists into a list of pairs, truncating the longer list 
if necessary. The criterion for primitive corecursion for nested 
codatatypes requires the corecursive call to be applied through 
map, which is the case for ⊞. Moreover, by virtue of being well
behaved, ⊞ can be used to define the shuffle product of trees:

t ⊠ u = Node (val t × val u)
             (map (λ(t', u'). (t ⊠ u') ⊞ (t' ⊠ u)) (zip (sub t) (sub u)))

Again, the corecursive call takes place inside map, but this time
also in the context of ⊞. The specification of ⊠ is corecursive up-to
and well behaved.
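
The nested pattern for the pointwise tree sum can be sketched in Python by delaying the list of subtrees. This is an illustration only: the Tree class, tree_plus, and the sample input depth_tree are hypothetical names introduced here, and only the sum (not the shuffle product) is shown.

```python
class Tree:
    """A lazy Rose tree node: a label plus a memoized thunk for the subtree list."""
    def __init__(self, val, sub_thunk):
        self.val = val
        self._thunk = sub_thunk
        self._sub = None

    @property
    def sub(self):
        if self._sub is None:            # build the subtree list at most once
            self._sub = self._thunk()
        return self._sub

def tree_plus(t, u):
    # Node (val t + val u) (map (pairwise sum) (zip (sub t) (sub u)))
    return Tree(t.val + u.val,
                lambda: [tree_plus(a, b) for a, b in zip(t.sub, u.sub)])

def depth_tree(d, width):
    """Sample input: each node at depth d is labelled d and has `width` children."""
    return Tree(d, lambda: [depth_tree(d + 1, width) for _ in range(width)])

s = tree_plus(depth_tree(0, 2), depth_tree(10, 3))
print(s.val)          # 10
print(len(s.sub))     # 2, because zip truncates to the shorter child list
print(s.sub[0].val)   # 12
```

The corecursive call appears only inside the list comprehension, which corresponds to the call being applied through map in the specification.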

2.3 Mixed Recursion-Corecursion 

It is often convenient to let a corecursive function perform some 
finite computation before producing a constructor. With mixed 
recursion-corecursion, a finite number of unguarded recursive calls 
perform this calculation before reaching a guarded corecursive call. 

The intuitive criterion for accepting such definitions is that the 
unguarded recursive call could be unfolded to arbitrary finite depth, 
ultimately yielding a purely corecursive definition. An example is
the primes function taken from Di Gianantonio and Miculan:

primes m n = if (m = 0 ∧ n > 1) ∨ gcd m n = 1
             then SCons n (primes (m × n) (n + 1))
             else primes m (n + 1)

When called with m = 1 and n = 2, this function computes the
stream of prime numbers. The unguarded call in the else branch
increments its second argument n until it is coprime to the first argument
m (i.e., the greatest common divisor of m and n is 1). For any
positive integers m and n, the numbers m and m × n + 1 are coprime,
yielding an upper bound on the number of times n is increased.
Hence, the function will take the else branch at most finitely often
before taking the then branch and producing one constructor.
There is a slight complication when m = 0 and n > 1: Without the
first disjunct in the if condition, the function could stall. (This corner
case was overlooked in the original example.)
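
The primes specification transcribes almost literally into a lazy generator. The following Python sketch is meant as an informal check of the equation, not as the framework's construction; it relies on math.gcd and itertools.islice from the standard library.

```python
from itertools import islice
from math import gcd

def primes(m, n):
    if (m == 0 and n > 1) or gcd(m, n) == 1:
        yield n                                # guarded corecursive call below
        yield from primes(m * n, n + 1)
    else:
        yield from primes(m, n + 1)            # unguarded, terminating call

print(list(islice(primes(1, 2), 8)))   # [2, 3, 5, 7, 11, 13, 17, 19]
```

Here m accumulates the product of the primes found so far, so the else branch is taken only for the finitely many composite values of n between two consecutive primes.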

Mixed recursion-corecursion also allows us to give a definition 
of factorials without involving any auxiliary stream operations: 

facC n a i = if i = 0 then SCons a (facC (n + 1) 1 (n + 1))
             else facC n (a × i) (i − 1)

The recursion in the else branch computes the next factorial by
means of an accumulator a and a decreasing counter i. When the
counter reaches 0, facC corecursively produces a constructor with
the accumulated value and resets the accumulator and the counter.
Unguarded calls may also occur under well-behaved operations:

cat n = if n > 0 then cat (n − 1) ⊕ SCons 0 (cat (n + 1))
        else SCons 1 (cat 1)

The call cat 1 computes the stream of Catalan numbers C_1, C_2, ...,
where $C_n = \frac{1}{n+1}\binom{2n}{n}$. This fact is far from obvious. Productivity is
not entirely obvious either, but it is guaranteed by the framework.
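
Both mixed definitions, facC and cat, can be explored with lazy generators. The Python sketch below is a hedged illustration: fac_c, cat, and plus are names chosen here, the starting arguments fac_c(1, 1, 1) are an assumption for the demonstration, and the code recomputes substreams, so it is only practical for short prefixes.

```python
from itertools import chain, islice

def fac_c(n, a, i):
    if i == 0:
        yield a                                  # produce the accumulated value
        yield from fac_c(n + 1, 1, n + 1)        # guarded call, counters reset
    else:
        yield from fac_c(n, a * i, i - 1)        # unguarded call, i decreases

def plus(xs, ys):
    """Pointwise sum of two lazy streams."""
    for x, y in zip(xs, ys):
        yield x + y

def cat(n):
    if n > 0:
        # cat (n - 1) + SCons 0 (cat (n + 1)): unguarded call under plus
        yield from plus(cat(n - 1), chain([0], cat(n + 1)))
    else:
        yield 1
        yield from cat(1)

print(list(islice(fac_c(1, 1, 1), 5)))   # [1, 2, 6, 24, 120]
print(list(islice(cat(1), 5)))           # [1, 2, 5, 14, 42]
```

In cat, the call with argument n + 1 is wrapped in a generator that is not started until its first element is needed, whereas the call with argument n − 1 bottoms out at n = 0 after finitely many steps.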

When mixing recursion and corecursion, it is easy to get things
wrong in the absence of solid foundations. Consider this apparently
unobjectionable specification in which the corecursive call
is guarded by SCons and the unguarded call’s argument strictly decreases
toward 0:

nasty n = if n < 2 then SCons n (nasty (n + 1))
          else inc (tail (nasty (n − 1)))

Here, inc = smap (λx. x + 1) and smap is the map function on
streams. A simple calculation reveals that this specification is inconsistent
because the tail selector before the unguarded call destructs
the freshly produced constructor from the other branch:

nasty 2 = inc (tail (nasty 1)) 

= inc (tail (SCons 1 (nasty 2))) 

= inc (nasty 2) 

This is a close cousin of the infamous f x = f x + 1 example mentioned
in the introduction. The framework rejects this specification
on the grounds that the tail selector in the recursive call context is
not well behaved.

We conclude this section with a practical example from the 
literature. Given the polymorphic type 

codatatype LList(A) =
  LNil | LCons (head : A) (tail : LList(A))

of lazy lists, the task is to define the function lfilter : (A → Bool) →
LList(A) → LList(A) that filters out all elements failing to satisfy
the given predicate. Thanks to the support for mixed recursion-corecursion,
the framework transforms what was for Lochbihler
and Hölzl [36] a research problem into a routine exercise:

lfilter P xs = if ∀x ∈ xs. ¬ P x
               then LNil
               else if P (head xs)
                    then LCons (head xs) (lfilter P (tail xs))
                    else lfilter P (tail xs)

The first self-call is corecursive and guarded by LCons, whereas 
the second self-call is terminating, because the number of “false” 
elements until reaching the next “true” element (whose existence is 
guaranteed by the first if condition) decreases by one. 

2.4 Coinduction Up-to 

Once a corecursive specification has been accepted as productive,
we normally want to reason about it. In proof assistants, codatatypes
are accompanied by a notion of structural coinduction that
matches primitively corecursive functions. For nonprimitive specifications,
our framework provides the more advanced proof principle
of coinduction up to congruence, or simply coinduction up-to.
The structural coinduction principle for streams is as follows:

    R l r      ∀s t. R s t → head s = head t ∧ R (tail s) (tail t)
    ---------------------------------------------------------------
                                 l = r

Coinduction allows us to prove an equality on streams by providing
a relation R that relates the left-hand side with the right-hand
side (first premise) and that constitutes a bisimulation (second
premise). Streams that are related by a bisimulation cannot be
distinguished by taking observations (i.e., by applying the head
and tail selectors); therefore they must be equal. In other words,
equality is the largest bisimulation.

Creativity is generally required to instantiate R with a bisimulation.
However, given a goal l = r, the following canonical candidate
often works: λs t. ∃xs. s = l ∧ t = r, where xs are variables occurring
free in l or r. As a rehearsal, let us prove that the primitively
corecursive operation ⊕ is commutative.

Proposition 1. xs ⊕ ys = ys ⊕ xs.

Proof. We first show that R = (λs t. ∃xs ys. s = xs ⊕ ys ∧ t = ys ⊕ xs)
is a bisimulation. We fix two streams s and t for which we
assume R s t (i.e., there exist two streams xs and ys such that s =
xs ⊕ ys and t = ys ⊕ xs). Next, we must show that head s = head t
and R (tail s) (tail t). The first property can be discharged by a
simple calculation. For the second one:

    R (tail s) (tail t)
    ⟷ R (tail (xs ⊕ ys)) (tail (ys ⊕ xs))
    ⟷ R (tail xs ⊕ tail ys) (tail ys ⊕ tail xs)
    ⟷ ∃xs' ys'. tail xs ⊕ tail ys = xs' ⊕ ys' ∧
                tail ys ⊕ tail xs = ys' ⊕ xs'

The last formula can be shown to hold by selecting xs' = tail xs and
ys' = tail ys. Moreover, R (xs ⊕ ys) (ys ⊕ xs) holds. Therefore, the
thesis follows by structural coinduction. □

If we attempt to prove the commutativity of ⊗ analogously, we
eventually encounter a formula of the form R (··· ⊕ ···) (··· ⊕ ···),
because ⊗ is defined in terms of ⊕. Since R mentions only ⊗ but
not ⊕, we are stuck. An ad hoc solution would be to replace the
canonical R with a bisimulation that allows for descending under ⊕.
However, this would be needed for almost every property about ⊗.

A more reusable solution is to strengthen the coinduction principle
upon registration of a new well-behaved operation. The
strengthening mirrors the acquired possibility of the new operation
to appear in the corecursive call context. It is technically represented
by a congruence closure cl : (Stream → Stream → Bool) →
Stream → Stream → Bool. The coinduction up-to principle is almost
identical to structural coinduction, except that the corecursive
application of R is replaced by cl R:

    R l r      ∀s t. R s t → head s = head t ∧ cl R (tail s) (tail t)
    ------------------------------------------------------------------
                                  l = r

The principle evolves with every newly registered well-behaved
operation in the sense that our framework refines the definition of
the congruence closure cl. (Strictly speaking, a fresh symbol is
introduced each time.) For example, after registering SCons and ⊕,
cl R is the least reflexive, symmetric, transitive relation containing R
and satisfying the rules

    x = y    cl R xs ys                  cl R xs ys    cl R xs' ys'
    ------------------------------      ----------------------------
    cl R (SCons x xs) (SCons y ys)      cl R (xs ⊕ xs') (ys ⊕ ys')

After defining and registering ⊗, the relation cl R is extended to
also satisfy

    cl R xs ys    cl R xs' ys'
    ----------------------------
    cl R (xs ⊗ xs') (ys ⊗ ys')

Let us apply the strengthened coinduction principle to prove the 
distributivity of stream exponentiation over pointwise addition: 

Proposition 2. exp (xs ⊕ ys) = exp xs ⊗ exp ys.

Proof. We first show that R = (λs t. ∃xs ys. s = exp (xs ⊕ ys) ∧ t =
exp xs ⊗ exp ys) is a bisimulation. We fix two streams s and t for
which we assume R s t (i.e., there exist two streams xs and ys such
that s = exp (xs ⊕ ys) and t = exp xs ⊗ exp ys). Next, we show that
head s = head t and cl R (tail s) (tail t):

    head s = head (exp (xs ⊕ ys)) = 2 ^ head (xs ⊕ ys)
           = 2 ^ (head xs + head ys) = 2 ^ head xs × 2 ^ head ys
           = head (exp xs) × head (exp ys)
           = head (exp xs ⊗ exp ys) = head t

    cl R (tail s) (tail t)
    ⟷ cl R (tail (exp (xs ⊕ ys))) (tail (exp xs ⊗ exp ys))
    ⟷ cl R ((tail xs ⊕ tail ys) ⊗ exp (xs ⊕ ys))
            (exp xs ⊗ (tail ys ⊗ exp ys) ⊕ (tail xs ⊗ exp xs) ⊗ exp ys)
    ⟷ (*) cl R (tail xs ⊗ exp (xs ⊕ ys) ⊕ tail ys ⊗ exp (xs ⊕ ys))
            (tail xs ⊗ (exp xs ⊗ exp ys) ⊕ tail ys ⊗ (exp xs ⊗ exp ys))
    ⟵ (⊕) cl R (tail xs ⊗ exp (xs ⊕ ys)) (tail xs ⊗ (exp xs ⊗ exp ys)) ∧
            cl R (tail ys ⊗ exp (xs ⊕ ys)) (tail ys ⊗ (exp xs ⊗ exp ys))
    ⟵ (⊗) cl R (tail xs) (tail xs) ∧ cl R (tail ys) (tail ys) ∧
            cl R (exp (xs ⊕ ys)) (exp xs ⊗ exp ys)
    ⟵ R (exp (xs ⊕ ys)) (exp xs ⊗ exp ys)

The step marked with * appeals to associativity and commutativity
of ⊕ and ⊗ as well as distributivity of ⊗ over ⊕. These properties
are likewise proved by coinduction up-to. The implications marked
with ⊕ and ⊗ are justified by the respective congruence rules. The
last implication uses reflexivity and expands R to its closure cl R.

Finally, it is easy to see that R (exp (xs ⊕ ys)) (exp xs ⊗ exp ys)
holds. Therefore, the thesis follows by coinduction up-to. □
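
A finite prefix comparison is no substitute for the coinductive proof, but it gives a quick sanity check of Proposition 2 on concrete inputs. The Python sketch below assumes a hypothetical lazy Stream class with a memoized tail; all names in it (Stream, take, nats_from, plus, shuffle, exp) are introduced for this illustration only.

```python
class Stream:
    """A lazy stream cell: a head plus a memoized thunk for the tail."""
    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._tail = None

    @property
    def tail(self):
        if self._tail is None:          # force the tail at most once
            self._tail = self._thunk()
        return self._tail

def take(k, s):
    out = []
    for _ in range(k):
        out.append(s.head)
        s = s.tail
    return out

def nats_from(n):
    return Stream(n, lambda: nats_from(n + 1))

def plus(xs, ys):       # pointwise sum
    return Stream(xs.head + ys.head, lambda: plus(xs.tail, ys.tail))

def shuffle(xs, ys):    # shuffle product
    return Stream(xs.head * ys.head,
                  lambda: plus(shuffle(xs, ys.tail), shuffle(xs.tail, ys)))

def exp(xs):            # stream exponentiation
    e = Stream(2 ** xs.head, lambda: shuffle(xs.tail, e))
    return e

# Compare both sides of the proposition on the first five elements.
xs, ys = nats_from(0), nats_from(1)
lhs = take(5, exp(plus(xs, ys)))
rhs = take(5, shuffle(exp(xs), exp(ys)))
print(lhs == rhs)   # True
```

The choice of inputs (the naturals from 0 and from 1) is arbitrary; any two streams of naturals could be substituted.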

The formalization accompanying this paper also contains
proofs of facA = facC 1 1 1 = smap fac (natsFrom 1), facB =
SCons 1 facA, and fibA = fibB, where fac is the factorial on Nat.

Nested corecursion up-to is also reflected with a suitable 
strengthened coinduction rule. For Tree, this strengthening takes 
place under the rel operator on list, similarly to the corecursive 
calls occurring nested in the map function: 

    R l r      ∀s t. R s t → val s = val t ∧ rel (cl R) (sub s) (sub t)
    --------------------------------------------------------------------
                                   l = r

The rel R operator lifts the binary predicate R : A → B → Bool to
a predicate List A → List B → Bool. More precisely, rel R xs ys
holds if and only if xs and ys have the same length and parallel
elements of xs and ys are related by R. This nested coinduction
rule is convenient provided there is some infrastructure to descend
under rel (as is the case in Isabelle/HOL). The formalization
establishes several arithmetic properties of ⊞ and ⊠.

3. Extensible Corecursors 

We now describe the definitional and proof mechanisms that substantiate
flexible corecursive definitions in the style of Section 2.
They are based on the modular maintenance of infrastructure for 
the corecursor associated with a codatatype, with the possibility of 
open-ended incremental improvement. We present the approach for 
an arbitrary codatatype given as the greatest fixpoint of an arbitrary 
(bounded) functor. The approach is quite general and does not rely 
on any particular grammar for specifying codatatypes. 

Extensibility is an integral feature of the framework. In principle,
an implementation could redo the constructions from scratch
each time a well-behaved operation is registered, but it would 
give rise to a quadratic number of definitions, slowing down the 
proof assistant. The incremental approach is also more flexible and 
future-proof, allowing mixed fixpoints and composition with other 
(co)recursors, including some that do not exist yet. 

3.1 Functors and Relators 

Functional programming languages and proof assistants necessarily
maintain a database of the user-defined types or, more generally,
type constructors, which can be thought of as functions F : Set^n → Set
operating on sets (or perhaps on ordered sets). It is often useful to
maintain more structure along with these type constructors:

• a functorial action Fmap : ∏_{Ā,B̄ ∈ Set^n} (Ā → B̄) → F Ā → F B̄,
i.e., a polymorphic function of the indicated type that
commutes with identity id_Ā : Ā → Ā and composition;

• a relator Frel : ∏_{Ā,B̄ ∈ Set^n} (Ā → B̄ → Bool) → F Ā → F B̄ → Bool,
i.e., a polymorphic function of the indicated type
which commutes with binary-relation identity and composition.

Following standard notation from category theory, we write F instead
of Fmap. Given binary relations R_i : A_i → B_i → Bool for
1 ≤ i ≤ n, we think of Frel R̄ : F Ā → F B̄ → Bool as the natural
lifting of R̄ along F; for example, if F is List (and hence n = 1),
Frel lifts a relation on elements to the componentwise relation on
lists (also requiring equal length). It is well known that the positive
type constructors defined by standard means (basic types, composition,
least or greatest fixpoints) have canonical functorial and relator
structure. This is crucial for the foundational construction of
user-specified (co)datatypes in Isabelle/HOL [49].
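
For the List case, the functorial action and the relator are easy to write down concretely. The following Python sketch uses ordinary finite lists; list_map and list_rel are names chosen for this example, and the functor laws are only spot-checked on one sample list rather than proved.

```python
def list_map(f, xs):
    """Functorial action of List: apply f to every element, keeping the shape."""
    return [f(x) for x in xs]

def list_rel(r, xs, ys):
    """Relator of List: equal length and parallel elements related by r."""
    return len(xs) == len(ys) and all(r(x, y) for x, y in zip(xs, ys))

# Spot-check the functor laws on a sample list.
xs = [1, 2, 3]
double = lambda x: 2 * x
succ = lambda x: x + 1
print(list_map(lambda x: x, xs) == xs)                       # True (identity)
print(list_map(lambda x: succ(double(x)), xs)
      == list_map(succ, list_map(double, xs)))               # True (composition)

# Lift "less than" from numbers to lists.
less = lambda a, b: a < b
print(list_rel(less, [1, 2], [2, 3]))      # True
print(list_rel(less, [1, 2], [2, 3, 4]))   # False, the lengths differ
```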

But even nonpositive type constructors G : Set^n → Set exhibit
a relator-like structure Grel : ∏_{Ā,B̄ ∈ Set^n} (Ā → B̄ → Bool) → (G Ā →
G B̄ → Bool) (which need not commute with relation composition,
though). For example, if G : Set^2 → Set is the function-space
constructor G (A_1, A_2) = A_1 → A_2 and f ∈ G (A_1, A_2),
g ∈ G (B_1, B_2), R_1 : A_1 → B_1 → Bool, and R_2 : A_2 → B_2 → Bool,
then Grel R_1 R_2 f g is defined as ∀a_1 ∈ A_1. ∀b_1 ∈ B_1. R_1 a_1 b_1 →
R_2 (f a_1) (g b_1). A polymorphic function c : ∏_{Ā ∈ Set^n} G Ā
is called parametric [41, 52] if ∀Ā, B̄ ∈ Set^n. ∀R̄ : Ā → B̄ →
Bool. Grel R̄ c_Ā c_B̄. The maintenance of relator-like structures is
very helpful for automating theorem transfer along isomorphisms
and quotients. Here we explore an additional benefit of maintaining
functorial and relator structure for type constructors: the
possibility to extend the corecursor in reaction to user input.
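
The function-space relator can be made concrete when the carrier sets are finite, so that the universal quantifiers become exhaustive checks. The Python sketch below rests on that assumption; fun_rel, the sample sets A and B, and the sample relation r are all hypothetical choices for the illustration.

```python
from itertools import product

def fun_rel(r1, r2, dom_a, dom_b):
    """Function-space relator on finite carriers:
    f and g are related if they map r1-related inputs to r2-related outputs."""
    def related(f, g):
        return all(r2(f(a), g(b))
                   for a, b in product(dom_a, dom_b) if r1(a, b))
    return related

A, B = [0, 1, 2], ["a", "b"]
pairs = {(1, "a"), (2, "b")}
r = lambda a, b: (a, b) in pairs

identity = lambda x: x
# The identity respects every relation, as parametricity predicts.
print(fun_rel(r, r, A, B)(identity, identity))           # True
# Constant functions that pick fixed elements need not respect r.
print(fun_rel(r, r, A, B)(lambda _: 0, lambda _: "a"))   # False, (0, "a") is not in r
```

The second check fails because the two constant functions inspect the type they are used at, which a parametric polymorphic function cannot do.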

In this section, we assume that all the considered type construc¬ 
tors are both functors and relators, that they include basic func¬ 
tors such as identity, constant, sum, and product, and that they are 
closed under least fixpoints (initial algebras) and greatest fixpoints 
(final coalgebras). Examples of such classes of type constructors
include the datafunctors [20], the containers [?], and the bounded
natural functors [49].

We focus on the case of a unary codatatype-generating functor $F : \mathsf{Set} \to \mathsf{Set}$. The codatatype of interest will be its greatest fixpoint (or final coalgebra) $J = \mathsf{gfp}\;F$. This generic situation already covers the vast majority of interesting codatatypes, since F can represent arbitrarily complex nesting. For example, if $F = (\lambda A.\ \mathsf{Nat} \times \mathsf{List}\;A)$, then J corresponds to the Tree codatatype presented in Section 2. The extension to mutually defined codatatypes is straightforward but tedious. Our examples will take J to be the Stream type from Section 2, with $F = (\lambda A.\ \mathsf{Nat} \times A)$.

Given a set A, it will be useful to think of the elements $x \in F\,A$ as consisting of a shape together with content that fills the shape with elements of A. If $F\,A = \mathsf{Nat} \times A$, the shape of $x = (n, a)$ is $(n, \_)$ and the content is $a$; if $F\,A = \mathsf{List}\;A$, the shape of $x = [x_1, \ldots, x_n]$ is the n-slot container and the content consists of the $x_i$'s.

According to this view, for each $f : A \to B$, the functorial action associated with F sends any x into an element $F\,f\,x$ of the same shape as x but with each content item a replaced by $f\,a$. Technically, this view can be supported by custom notions such as containers [?] or, more simply, via a parametric function of type $\prod_{A \in \mathsf{Set}} F\,A \to \mathsf{Set}\;A$ that collects the content elements [49].

3.2 Primitive Corecursion 

The codatatype that defines J also introduces the constructor and destructor bijections $\mathsf{ctor} : F\,J \to J$ and $\mathsf{dtor} : J \to F\,J$ and the primitive corecursor $\mathsf{corecPrim} : \prod_{A \in \mathsf{Set}} (A \to F\,A) \to A \to J$ characterized by the equation $\mathsf{corecPrim}\;s\;a = \mathsf{ctor}\,(F\,(\mathsf{corecPrim}\;s)\,(s\;a))$. In elements $x \in F\,A$, the occurrences of content items $a \in A$ in the shape of x capture the positioning of the corecursive calls.

Example 3. Modulo currying, the pointwise sum of streams $\oplus$ is definable as corecPrim s, by taking $s : \mathsf{Stream}^2 \to \mathsf{Nat} \times \mathsf{Stream}^2$ to be $\lambda(xs, ys).\ (\mathsf{head}\;xs + \mathsf{head}\;ys,\ (\mathsf{tail}\;xs, \mathsf{tail}\;ys))$.

In Example 3 and elsewhere, we lighten notation by identifying the curried and uncurried forms of functions, counting on implicit coercions between the two.
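The characteristic equation of corecPrim can be mimicked in Python for the stream functor $F\,A = \mathsf{Nat} \times A$. The sketch below is only an executable analogy, not the Isabelle construction: the `Stream` class (a head plus a memoized tail thunk) and the helpers `corec_prim`, `nats_from`, and `take` are assumptions introduced here.

```python
class Stream:
    """Infinite stream: a head plus a lazily computed, memoized tail."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._tail = None

    @property
    def tail(self):
        if self._tail is None:
            self._tail = self._thunk()
        return self._tail


def corec_prim(s):
    """corecPrim s a = ctor (F (corecPrim s) (s a)) for F A = Nat x A.
    The corecursive call sits under the constructor, hence inside a thunk."""
    def g(a):
        n, a_next = s(a)
        return Stream(n, lambda: g(a_next))
    return g


def take(n, xs):
    """First n elements of a stream, as a list."""
    out = []
    for _ in range(n):
        out.append(xs.head)
        xs = xs.tail
    return out


# natsFrom n = SCons n (natsFrom (n + 1))
nats_from = corec_prim(lambda n: (n, n + 1))

# Pointwise sum, with s as in Example 3 (uncurried: the argument is a pair).
plus = corec_prim(lambda p: (p[0].head + p[1].head, (p[0].tail, p[1].tail)))

print(take(5, plus((nats_from(0), nats_from(10)))))  # [10, 12, 14, 16, 18]
```

The function `s` only describes one step: the observable head and the argument of the single corecursive call. Everything else is supplied by the corecursor.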

3.3 The Corecursion State 

Given any functor $\Sigma : \mathsf{Set} \to \mathsf{Set}$, we define its free-monad functor $\Sigma^*$ by $\Sigma^* A = \mathsf{lfp}\,(\lambda B.\ A + \Sigma\,B)$. We write $\mathsf{leaf} : A \to \Sigma^* A$ and $\mathsf{op} : \Sigma\,(\Sigma^* A) \to \Sigma^* A$ for the left and right injections into $\Sigma^* A$.

The functions leaf and op are in fact polymorphic; for example, leaf has type $\prod_{A \in \mathsf{Set}} A \to \Sigma^* A$. We often omit the set parameters of polymorphic functions if they can be inferred from the context, writing leaf and op instead of $\mathsf{leaf}_A$ and $\mathsf{op}_A$.

At any given moment, we maintain the following data associated with J, which we call a corecursion state:

• a finite number of functors $K_1, \ldots, K_n : \mathsf{Set} \to \mathsf{Set}$ and, for each $K_i$, a function $f_i : K_i\,J \to J$;

• a polymorphic function $\Lambda : \prod_{A \in \mathsf{Set}} \Sigma\,(A \times F\,A) \to F\,(\Sigma^* A)$.

We call the $f_i$'s the well-behaved operations and define their collective signature functor $\Sigma$ as $\lambda A.\ K_1\,A + \cdots + K_n\,A$, where $\iota_i : K_i \to \Sigma$ is the standard embedding of $K_i$ into $\Sigma$. We call $\Lambda$ the corecursor seed. The corecursion state is subject to the following conditions:

Parametricity: $\Lambda$ is parametric.

Well-behavedness: Each $f_i$ satisfies the characteristic equation

$$f_i\;x = \mathsf{ctor}\,\bigl(F\,\mathsf{eval}\,(\Lambda\,(\Sigma\,\langle \mathsf{id}, \mathsf{dtor}\rangle\,(\iota_i\;x)))\bigr)$$

The convolution operator $\langle\_, \_\rangle$ builds a function $\langle f, g\rangle : B \to C \times D$ from two functions $f : B \to C$ and $g : B \to D$, and $\mathsf{eval} : \Sigma^* J \to J$ is the canonical evaluation function defined recursively (using the primitive recursor associated with $\Sigma^*$):

$$\begin{aligned}
\mathsf{eval}\,(\mathsf{leaf}\;j) &= j \\
\mathsf{eval}\,(\mathsf{op}\;z) &= \mathsf{case}\ z\ \mathsf{of}\ \iota_i\;t \Rightarrow f_i\,(K_i\;\mathsf{eval}\;t)
\end{aligned}$$

(Note that, in the recursive call on the right, eval is applied to t by "lifting" it through the functor $K_i$.) Functions having the type of $\Lambda$ and additionally assumed parametric (or, equivalently, assumed to be natural transformations) are known in category theory as "abstract GSOS rules." They were introduced by Turi and Plotkin [?] and further studied by Bartels [9], Jacobs [26], Hinze and James [22], Milius et al. [38], and others.

Thus, a corecursion state is a triple $(K, f, \Lambda)$. As we will see in Section 3.6, the state evolves as users define and register new functions. The $f_i$'s are the operations that have been registered as safe for participating in the context of corecursive calls. Since $f_i$ has type $K_i\,J \to J$, we think of $K_i$ as encoding the arity of $f_i$. Then $\Sigma$, the sum of the $K_i$'s, represents the signature consisting of all the $f_i$'s. Thus, for each A, $\Sigma^* A$ represents the set of formal expressions over $\Sigma$ and A, i.e., the trees built starting from the "variables" in A as leaves by applying operation symbols corresponding to the $f_i$'s. Finally, eval evaluates in J the formal expressions of $\Sigma^* J$ by applying the functions $f_i$ recursively.

If the functors $K_i$ are restricted to be finite monomials $\lambda A.\ A^{k_i}$, the functor $\Sigma$ can be seen as a standard algebraic signature and $(\Sigma^* A, \mathsf{op})$ as the standard term algebra for this signature, over the variables A. However, we allow $K_i$ to be more exotic; for example, $K_i\,A$ can be $A^{\mathsf{Nat}}$ (representing an infinitary operation) or one of List A and FinSet A (representing an operation taking a varying finite number of ordered or unordered arguments).


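[Figure 1: The well-behavedness condition. Commuting square with $f_i : K_i\,J \to J$ on top, $\Lambda \circ \Sigma\,\langle\mathsf{id}, \mathsf{dtor}\rangle \circ \iota_i : K_i\,J \to F\,(\Sigma^* J)$ on the left, $F\,\mathsf{eval} : F\,(\Sigma^* J) \to F\,J$ at the bottom, and $\mathsf{ctor} : F\,J \to J$ on the right.]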


But what guarantees that the $f_i$'s are indeed safe as contexts for corecursive calls? In particular, how can the framework exclude tail while allowing SCons, $\oplus$, and $\otimes$? This is where the parametricity and well-behavedness conditions on the state enter the picture.

We start with well-behavedness. Assume $x \in K_i\,J$, which is unambiguously represented in $\Sigma\,J$ as $\iota_i\;x$. Let $j_1, \ldots, j_m \in J$ be the content items of $\iota_i\;x$ (placed in various slots in the shape of x). To evaluate $f_i$ on x, we first corecursively destruct the $j_l$'s while also keeping the originals, thus replacing each $j_l$ with $(j_l, \mathsf{dtor}\;j_l)$. Then we apply the transformation $\Lambda$ to obtain an element of $F\,(\Sigma^* J)$, which has an F-shape at the top (the first produced observable data) and for each slot in this shape an element of $\Sigma^* J$, i.e., a formal-expression tree having leaves in J and built using operation symbols from the signature (the corecursive continuation):

$$K_i\,J \xrightarrow{\ \iota_i\ } \Sigma\,J \xrightarrow{\ \Sigma\,\langle\mathsf{id},\,\mathsf{dtor}\rangle\ } \Sigma\,(J \times F\,J) \xrightarrow{\ \Lambda\ } F\,(\Sigma^* J)$$

In summary, $\Lambda$ is a schematic representation of the mutually corecursive behavior of the well-behaved operations up to the production of the first observable data. This intuition is made formal in the well-behavedness condition, which states that the diagram in Figure 1 commutes for each $f_i$. (We could replace the right upward arrow labeled by ctor with a downward arrow labeled by dtor without changing the diagram's meaning. However, we consistently prefer the constructor view in our exposition.)

In the above explanations, we saw that it suffices to peel off one layer of the arguments $j_l$ (by applying dtor) for a well-behaved operation $f_i$ to produce, via $\Lambda$, one layer of the result and to delegate the rest of the computation to a context consisting of a combination of well-behaved operations (an element of $\Sigma^* J$). But how to formally express that exploring one layer is enough, i.e., that applying $\Lambda : \Sigma\,(J \times F\,J) \to F\,(\Sigma^* J)$ to the pairs $(j_l, \mathsf{dtor}\;j_l)$ does not result in a deeper exploration? An elegant way of capturing this is to require that $\Lambda$, which is a polymorphic function, operates without analyzing J, i.e., that it operates in the same way on $\Sigma\,(A \times F\,A) \to F\,(\Sigma^* A)$ for any set A. This requirement is precisely parametricity.

Strictly speaking, the well-behaved operations $f_i$ are a redundant piece of data in the state $(K, f, \Lambda)$, since, assuming $\Lambda$ parametric, we can prove that there exists a unique tuple $f$ that satisfies the well-behavedness condition. In other words, the operations $f_i$ could be derived on a per-need basis.

Example 4. Let J = Stream and assume that $\mathsf{SCons} : \mathsf{Nat} \times \mathsf{Stream} \to \mathsf{Stream}$ and $\oplus : \mathsf{Stream}^2 \to \mathsf{Stream}$ are the only well-behaved operations registered so far. Then $K_1 = (\lambda B.\ \mathsf{Nat} \times B)$, $f_1 = \mathsf{SCons}$, $K_2 = (\lambda B.\ B^2)$, and $f_2 = \oplus$. Moreover, $\Sigma^* A = \mathsf{lfp}\,(\lambda B.\ A + (\mathsf{Nat} \times B + B^2))$ consists of formal-expression trees with leaves in A and built using arity-correct applications of operation symbols corresponding to SCons and $\oplus$, denoted by $\boxed{\mathsf{SCons}}$ and $\boxed{\oplus}$. Given $n \in \mathsf{Nat}$ and $a, b \in A$, an example of such a tree is $\mathsf{leaf}\;a \mathbin{\boxed{\oplus}} \boxed{\mathsf{SCons}}\,(n, \mathsf{leaf}\;a \mathbin{\boxed{\oplus}} \mathsf{leaf}\;b)$. If additionally A = J, then eval applied to the above tree is $a \oplus \mathsf{SCons}\;n\;(a \oplus b)$.

But what is $\Lambda$? As we show below, we need not worry about the global definition of $\Lambda$, since both $\Sigma$ and $\Lambda$ will be updated incrementally when registering new operations as well behaved. Nonetheless, a global definition of $\Lambda$ for SCons and $\oplus$ follows:

$$\begin{aligned}
\Lambda\;z = \mathsf{case}\ z\ \mathsf{of}\quad
& \iota_1\,(n, (a, (m, a'))) \Rightarrow (n,\ \boxed{\mathsf{SCons}}\,(m, \mathsf{leaf}\;a')) \\
& \iota_2\,((a, (m, a')), (b, (n, b'))) \Rightarrow (m + n,\ \mathsf{leaf}\;a' \mathbin{\boxed{\oplus}} \mathsf{leaf}\;b')
\end{aligned}$$

Informally, SCons and $\oplus$ exhibit the following behaviors:

• to evaluate SCons on a number n and an item a with (head a, tail a) = (m, a'), produce n and evaluate SCons on m and a', i.e., output SCons n (SCons m a') = SCons n a;

• to evaluate $\oplus$ on a, b with (head a, tail a) = (m, a') and (head b, tail b) = (n, b'), produce m + n and evaluate $\oplus$ on a' and b', i.e., output SCons (m + n) (a' $\oplus$ b').
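The corecursion state of Example 4 can be made concrete in a few lines of Python. This is a hedged sketch rather than the formalized construction: `Stream`, `Leaf`, `SConsSym`, `PlusSym`, `eval_tree`, `seed`, and `via_seed` are names invented here, with `seed` playing the role of $\Lambda$ and `via_seed` computing the right-hand side of the well-behavedness equation.

```python
from dataclasses import dataclass
from typing import Any


class Stream:
    """Infinite stream: a head plus a lazily computed, memoized tail."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._tail = None

    @property
    def tail(self):
        if self._tail is None:
            self._tail = self._thunk()
        return self._tail


def scons(n, xs):
    return Stream(n, lambda: xs)


def plus(xs, ys):
    return Stream(xs.head + ys.head, lambda: plus(xs.tail, ys.tail))


def dtor(xs):
    return (xs.head, xs.tail)


# Formal-expression trees over the signature {SCons, plus}.
@dataclass
class Leaf:
    value: Any


@dataclass
class SConsSym:
    n: int
    arg: Any


@dataclass
class PlusSym:
    left: Any
    right: Any


def eval_tree(t):
    """eval: interpret every operation symbol by the registered operation."""
    if isinstance(t, Leaf):
        return t.value
    if isinstance(t, SConsSym):
        return scons(t.n, eval_tree(t.arg))
    return plus(eval_tree(t.left), eval_tree(t.right))


def seed(z):
    """The seed for SCons (tag 1) and plus (tag 2); it never inspects a, a', b, b'."""
    tag, x = z
    if tag == 1:
        n, (a, (m, a1)) = x
        return (n, SConsSym(m, Leaf(a1)))
    (a, (m, a1)), (b, (n, b1)) = x
    return (m + n, PlusSym(Leaf(a1), Leaf(b1)))


def via_seed(tag, x):
    """Right-hand side of the well-behavedness equation for operation number tag."""
    if tag == 1:
        n, a = x
        paired = (n, (a, dtor(a)))
    else:
        a, b = x
        paired = ((a, dtor(a)), (b, dtor(b)))
    head, tree = seed((tag, paired))
    return Stream(head, lambda: eval_tree(tree))


def nats_from(k):
    return Stream(k, lambda: nats_from(k + 1))


def take(n, xs):
    out = []
    for _ in range(n):
        out.append(xs.head)
        xs = xs.tail
    return out


a, b = nats_from(0), nats_from(5)
print(take(4, via_seed(2, (a, b))), take(4, plus(a, b)))  # both [5, 7, 9, 11]
```

Comparing finite prefixes, as done here, only illustrates the well-behavedness equation on sample inputs; it is not a proof of it.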

3.4 Corecursion Up-to 

A corecursion state $(K, f, \Lambda)$ for an F-defined codatatype J consists
of a collection of operations on J, $f_i : K_i\,J \to J$, that satisfy the
well-behavedness properties expressed in terms of a parametric
function $\Lambda$. We are now ready to harvest the crop of this setting: a
corecursion principle for defining functions having J as codomain.

The principle will be represented by two corecursors, corecTop 
and corecFlex. Although subsumed by the latter, the former is inter¬ 
esting in its own right and will give us the opportunity to illustrate 
some fine points. Below we list the types of these corecursors along 
with that of the primitive corecursor for comparison: 

Primitive corecursor:

$\mathsf{corecPrim} : \prod_{A \in \mathsf{Set}} (A \to F\,A) \to A \to J$

Top-guarded corecursor up-to:

$\mathsf{corecTop} : \prod_{A \in \mathsf{Set}} (A \to F\,(\Sigma^* A)) \to A \to J$

Flexibly guarded corecursor up-to:

$\mathsf{corecFlex} : \prod_{A \in \mathsf{Set}} (A \to \Sigma^*\,(F\,(\Sigma^* A))) \to A \to J$

Figure 2 presents the diagrams whose commutativity properties
give the characteristic equations of these corecursors.

Each corecursor implements a contract of the following form: If, for each $a \in A$, one provides the intended corecursive behavior of $g\;a$ represented as $s\;a$, where s is a function from A, one obtains the function $g : A \to J$ (as the corresponding corecursor applied to s) satisfying a suitable fixpoint equation matching this behavior.

The codomain of s is the key to understanding the expressiveness of each corecursor. The intended corecursive calls are represented by A, and the call context is represented by the surrounding combination of functors (involving F, $\Sigma^*$, or both):

• for corecPrim, the allowed call contexts consist of a single constructor guard (represented by F);

• for corecTop, they consist of a constructor guard (represented by F) followed by any combination of well-behaved operations $f_i$ (represented by $\Sigma^*$);

• for corecFlex, they consist of any combination of well-behaved operations satisfying the condition that on every path leading to a corecursive call there exists at least one constructor guard (represented by $\Sigma^*\,(F\,(\Sigma^*\,\_))$).

We can see the computation of $g\;a$ by following the diagrams in Figure 2 counterclockwise from their top-left corners. The application $s\;a$ first builds the call context syntactically. Then g is applied corecursively on the leaves. Finally, the call context is evaluated: for corecPrim, it consists only of the guard (ctor); for corecTop, it involves the evaluation of the well-behaved operators (which may also include several occurrences of the guard) and ends with the evaluation of the top guard; for corecFlex, the evaluation of the guard is interspersed with that of the other well-behaved operations.

Example 5. For each example from Section 2, we give the corecursors that can handle it (assuming the necessary well-behaved operations were registered):

$\oplus$, everyOther: corecFlex, corecTop, corecPrim
onetwos, fibA, $\otimes$, exp, sup: corecFlex, corecTop
fibB, facA, facB: corecFlex

With the usual identification of $\mathsf{Unit} \to J$ and J, we can define fibA and facA as follows:

$$\begin{aligned}
\mathsf{fibA} &= \mathsf{corecTop}\,(\lambda u : \mathsf{Unit}.\ (0,\ \boxed{\mathsf{SCons}}\,(1, \mathsf{leaf}\;u) \mathbin{\boxed{\oplus}} \mathsf{leaf}\;u)) \\
\mathsf{facA} &= \mathsf{corecFlex}\,(\lambda u : \mathsf{Unit}.\ \mathsf{leaf}\,(1, \mathsf{leaf}\;u) \mathbin{\boxed{\otimes}} \mathsf{leaf}\,(1, \mathsf{leaf}\;u))
\end{aligned}$$

Let us look at fibA closely, comparing its specification fibA = SCons 0 (SCons 1 fibA $\oplus$ fibA) with its definition in terms of corecTop. The outer SCons guard (with 0 as first argument) corresponds to the outer pair $(0, \_)$. The inner SCons and $\oplus$ are interpreted as well-behaved operations and represented by the symbols $\boxed{\mathsf{SCons}}$ and $\boxed{\oplus}$ (cf. Example 4). Finally, the corecursive calls of fibA are captured by leaf u.

The desired specification can be obtained from the corecTop form by the characteristic equation of corecTop (for A = Unit) and the properties of eval as follows, where we simply write s, fibA, and leaf instead of their applications to the unique element () of Unit, namely s (), fibA (), and leaf ():

$$\begin{aligned}
& \mathsf{fibA} \\
={}& \quad \{\text{by the commutativity of Figure 2b with } \mathsf{fibA} = \mathsf{corecTop}\;s\} \\
& \mathsf{ctor}\,(F\,(\mathsf{eval} \circ \Sigma^*\,\mathsf{fibA})\;s) \\
={}& \quad \{\text{by the definitions of } F \text{ and } s\} \\
& \mathsf{SCons}\;0\;((\mathsf{eval} \circ \Sigma^*\,\mathsf{fibA})\,(\boxed{\mathsf{SCons}}\,(1, \mathsf{leaf}) \mathbin{\boxed{\oplus}} \mathsf{leaf})) \\
={}& \quad \{\text{by the definition of } \Sigma^*\} \\
& \mathsf{SCons}\;0\;(\mathsf{eval}\,(\boxed{\mathsf{SCons}}\,(1, \mathsf{leaf}\;\mathsf{fibA}) \mathbin{\boxed{\oplus}} \mathsf{leaf}\;\mathsf{fibA})) \\
={}& \quad \{\text{by the definition of } \mathsf{eval}\} \\
& \mathsf{SCons}\;0\;(\mathsf{SCons}\;1\;\mathsf{fibA} \oplus \mathsf{fibA})
\end{aligned}$$

The elimination of the corecTop infrastructure relies on simplification rules for the involved operators and can be fully automated.
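To make the top-guarded corecursor tangible, here is a Python sketch of corecTop for streams with SCons and the pointwise sum registered, applied to fibA. It follows the characteristic equation $\mathsf{corecTop}\;s\;a = \mathsf{ctor}\,(F\,(\mathsf{eval} \circ \Sigma^*\,(\mathsf{corecTop}\;s))\,(s\;a))$; all Python names (`Stream`, `map_tree`, `eval_tree`, `corec_top`, and so on) are assumptions of this sketch, and laziness is simulated with thunks.

```python
from dataclasses import dataclass
from typing import Any


class Stream:
    """Infinite stream: a head plus a lazily computed, memoized tail."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._tail = None

    @property
    def tail(self):
        if self._tail is None:
            self._tail = self._thunk()
        return self._tail


def scons(n, xs):
    return Stream(n, lambda: xs)


def plus(xs, ys):
    return Stream(xs.head + ys.head, lambda: plus(xs.tail, ys.tail))


@dataclass
class Leaf:
    value: Any


@dataclass
class SConsSym:
    n: int
    arg: Any


@dataclass
class PlusSym:
    left: Any
    right: Any


def map_tree(g, t):
    """Functorial action of the free monad: apply g to every leaf."""
    if isinstance(t, Leaf):
        return Leaf(g(t.value))
    if isinstance(t, SConsSym):
        return SConsSym(t.n, map_tree(g, t.arg))
    return PlusSym(map_tree(g, t.left), map_tree(g, t.right))


def eval_tree(t):
    """Evaluate a formal expression whose leaves are streams."""
    if isinstance(t, Leaf):
        return t.value
    if isinstance(t, SConsSym):
        return scons(t.n, eval_tree(t.arg))
    return plus(eval_tree(t.left), eval_tree(t.right))


def corec_top(s):
    """s a = (head, tree): a top guard followed by a well-behaved call context."""
    def g(a):
        head, tree = s(a)
        # Corecursive calls happen at the leaves, under the top guard (the thunk).
        return Stream(head, lambda: eval_tree(map_tree(g, tree)))
    return g


def take(n, xs):
    out = []
    for _ in range(n):
        out.append(xs.head)
        xs = xs.tail
    return out


# fibA = SCons 0 (SCons 1 fibA + fibA), with () standing for the element of Unit.
fib_a = corec_top(lambda u: (0, PlusSym(SConsSym(1, Leaf(u)), Leaf(u))))(())
print(take(10, fib_a))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Each corecursive call rebuilds a fresh copy of the stream, so this sketch takes time exponential in the number of requested elements. It is meant to show the shape of the definition, not to be an efficient implementation.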

Parametricity and well-behavedness are crucial for proving that 
the corecursors actually exist: 

Theorem 6. There exist the polymorphic functions corecTop and
corecFlex making the diagrams in Figures 2b and 2c commute.
Moreover, for each s of appropriate type, corecTop s or corecFlex s
is the unique function making its diagram commute.

Theorem 6 is a known result from the category theory literature:
The corecTop s version follows from the results in Bartels's
thesis [9], whereas the corecFlex s version was very recently (and
independently) proved by Milius et al. [38, Theorem 2.16].

3.5 Initializing the Corecursion State 

The simplest relaxation of primitive corecursion is the allowance of multiple constructors in the call context, in the style of Coq, as in the definition of onetwos (Section 2). Since this idea is independent of the choice of codatatype J, we realize it when bootstrapping the corecursion state.

More precisely, upon defining a codatatype J, we take the following initial corecursion state $\mathsf{initState} = (K, f, \Lambda)$:

• $K$ is a singleton consisting of (a copy of) F;

• $f$ is a singleton consisting of ctor;

• $\Lambda : \prod_{A \in \mathsf{Set}} F\,(A \times F\,A) \to F\,(F^* A)$ is defined as $F\,(\mathsf{op} \circ F\,\mathsf{leaf} \circ \mathsf{snd})$, where snd is the second product projection.
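For the stream functor $F\,A = \mathsf{Nat} \times A$, the initial seed $F\,(\mathsf{op} \circ F\,\mathsf{leaf} \circ \mathsf{snd})$ can be spelled out as a composition of three small functions. The Python names below (`fmap`, `compose`, `Leaf`, `Op`, `seed_init`) are hypothetical and serve only to trace the definition on a sample input.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass
class Leaf:
    """leaf : A -> F* A"""
    value: Any


@dataclass
class Op:
    """op : F (F* A) -> F* A, for F A = Nat x A"""
    shape: Tuple[int, Any]


def fmap(f):
    """Functorial action of F A = Nat x A: keep the number, map the content."""
    return lambda x: (x[0], f(x[1]))


def snd(p):
    return p[1]


def compose(*fs):
    """Right-to-left composition: compose(f, g, h)(x) = f(g(h(x)))."""
    def run(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return run


# The seed for the initial state: F (op . F leaf . snd)
seed_init = fmap(compose(Op, fmap(Leaf), snd))

# Input (n, (a, (m, a'))): the content items a and a' are never inspected,
# so the same code works for any type of content (here: strings).
print(seed_init((7, ("a", (3, "a'")))))
```

On the input `(n, (a, (m, a'))`-shaped data, the result has the form `(n, Op((m, Leaf(a'))))`, which matches the first clause of the global seed given in Example 4.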

Recall that the seed $\Lambda$ is designed to schematically represent the corecursive behavior of the registered operations by describing how they produce one layer of observable data. The definition in Figure 3a depicts this for ctor and instantiates to the schematic behavior of SCons presented at the end of Example 4.

[Figure 2: The corecursors. Panel (a), primitive corecursion: $\mathsf{corecPrim}\;s = \mathsf{ctor} \circ F\,(\mathsf{corecPrim}\;s) \circ s$. Panel (b), top-guarded corecursion up-to: $\mathsf{corecTop}\;s = \mathsf{ctor} \circ F\,(\mathsf{eval} \circ \Sigma^*\,(\mathsf{corecTop}\;s)) \circ s$. Panel (c), flexibly guarded corecursion up-to: $\mathsf{corecFlex}\;s = \mathsf{eval} \circ \Sigma^*\,\mathsf{ctor} \circ \Sigma^*\,(F\,\mathsf{eval}) \circ \Sigma^*\,(F\,(\Sigma^*\,(\mathsf{corecFlex}\;s))) \circ s$.]

[Figure 3: Definitions of $\Lambda$ and g. Panel (a), definition of $\Lambda$ for the initial state: the composite $F\,(A \times F\,A) \to F\,(F\,A) \to F\,(F\,(F^* A)) \to F\,(F^* A)$ of $F\,\mathsf{snd}$, $F\,(F\,\mathsf{leaf})$, and $F\,\mathsf{op}$. Panel (b): $g = \mathsf{corecTop}\;s$ on $K\,J$.]

Theorem 7. initState is a well-formed corecursion state, i.e., it satisfies parametricity and well-behavedness.

3.6 Advancing the Corecursion State

The role of a corecursion state $(K, f, \Lambda)$ for J is to provide infrastructure for flexible corecursive definitions of functions g between arbitrary sets A and J. If nothing else is known about A, this is the end of the story. However, assume that J is a component of A, in that A is constructed from J (possibly along with other components). For example, A could be List J, or $J \times (\mathsf{Nat} \to \mathsf{List}\;J)$. We capture this abstractly by assuming $A = K\,J$ for some functor K.

In this case, we have a fruitful situation of which we can profit for improving the corecursion state, and hence improving the flexibility of future corecursive definitions. Under some uniformity assumptions, g itself can be registered as well behaved.

More precisely, assume that $g : K\,J \to J$ is defined by $g = \mathsf{corecTop}\;s$ and that s can be proved to be uniform in the following sense: There exists a parametric function $\rho : \prod_{A \in \mathsf{Set}} K\,(A \times F\,A) \to F\,(\Sigma^*\,(K\,A))$ such that $s = \rho \circ K\,\langle\mathsf{id}, \mathsf{dtor}\rangle$ (Figure 3b). Then we can integrate g as a well-behaved operation as follows. We define $\mathsf{nextState}_g\,(K, f, \Lambda)$, the "next" corecursion state triggered by g, as $(K', f', \Lambda')$, where

• $K' = (K_1, \ldots, K_n, K)$ (similarly to $\Sigma$ versus $K$, we write $\Sigma'$ for the signature functor of $K'$; note that we essentially have $\Sigma' = \Sigma + K$);

• $f' = (f_1, \ldots, f_n, g)$;

• $\Lambda' : \prod_{A \in \mathsf{Set}} \Sigma'\,(A \times F\,A) \to F\,(\Sigma'^* A)$ is defined as $[F\,\mathsf{embL} \circ \Lambda,\ F\,\mathsf{embR} \circ \rho]$, where $[\_, \_]$ is the case operator on sums, which builds a function $[u, v] : B + C \to D$ from two functions $u : B \to D$ and $v : C \to D$, and $\mathsf{embL} : \Sigma^* A \to \Sigma'^* A$, $\mathsf{embR} : \Sigma^*\,(K\,A) \to \Sigma'^* A$ are the natural embeddings into $\Sigma'^* A$.

Theorem 8. If $(K, f, \Lambda)$ is a well-formed corecursion state, so is $\mathsf{nextState}_g\,(K, f, \Lambda)$.

In summary, we have the following scenario triggering the state's advancement:

1. One defines a new operation $g = \mathsf{corecTop}\;s$.

2. One shows that s factors through a parametric function $\rho$ and $K\,\langle\mathsf{id}, \mathsf{dtor}\rangle$ (as in Figure 3b); in other words, one shows that g's corecursive behavior s decomposes into a one-step destruction of the arguments and a parametric transformation (which is independent of J).

3. The corecursion state is updated by $\mathsf{nextState}_g$.

Example 9. The operations onetwos, $\oplus$, $\otimes$, and exp from Section 2 are covered by this scenario. For example, assume that SCons and $\oplus$ are registered as well behaved at the time of defining $\otimes$ (cf. Example 4). Then $K = (\lambda A.\ A^2)$ and $\otimes = \mathsf{corecTop}\;s$, where

$$s = (\lambda(xs, ys).\ (\mathsf{head}\;xs \times \mathsf{head}\;ys,\ \mathsf{leaf}\,(xs, \mathsf{tail}\;ys) \mathbin{\boxed{\oplus}} \mathsf{leaf}\,(\mathsf{tail}\;xs, ys)))$$

The function s decomposes into $\rho \circ K\,\langle\mathsf{id}, \langle\mathsf{head}, \mathsf{tail}\rangle\rangle$, where

$$\rho : \prod_{A \in \mathsf{Set}} (A \times (\mathsf{Nat} \times A))^2 \to \mathsf{Nat} \times \Sigma^*\,(A^2)$$

is defined by $\rho\,((a, (m, a')), (b, (n, b'))) = (m \times n,\ \mathsf{leaf}\,(a, b') \mathbin{\boxed{\oplus}} \mathsf{leaf}\,(a', b))$, which is clearly parametric. The act of determining $\rho$ from s and $K\,\langle\mathsf{id}, \langle\mathsf{head}, \mathsf{tail}\rangle\rangle$ is syntax-directed.
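The decomposition used for the shuffle product in Example 9 can be replayed in Python. In this sketch, which rests on invented names (`Stream`, `Leaf`, `PlusSym`, `corec_top`, `rho`, `shuffle`), the function `rho` plays the role of the parametric function $\rho$, and `s` is obtained by pairing each argument with its one-step destruction before calling `rho`.

```python
from dataclasses import dataclass
from typing import Any


class Stream:
    """Infinite stream: a head plus a lazily computed, memoized tail."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._tail = None

    @property
    def tail(self):
        if self._tail is None:
            self._tail = self._thunk()
        return self._tail


def plus(xs, ys):
    return Stream(xs.head + ys.head, lambda: plus(xs.tail, ys.tail))


def dtor(xs):
    return (xs.head, xs.tail)


@dataclass
class Leaf:
    value: Any


@dataclass
class PlusSym:
    left: Any
    right: Any


def map_tree(g, t):
    if isinstance(t, Leaf):
        return Leaf(g(t.value))
    return PlusSym(map_tree(g, t.left), map_tree(g, t.right))


def eval_tree(t):
    if isinstance(t, Leaf):
        return t.value
    return plus(eval_tree(t.left), eval_tree(t.right))


def corec_top(s):
    def g(a):
        head, tree = s(a)
        return Stream(head, lambda: eval_tree(map_tree(g, tree)))
    return g


def rho(z):
    """Parametric part: uses m and n, but only moves a, a', b, b' around."""
    (a, (m, a1)), (b, (n, b1)) = z
    return (m * n, PlusSym(Leaf((a, b1)), Leaf((a1, b))))


def s(p):
    """s = rho . K <id, dtor> for K A = A x A."""
    xs, ys = p
    return rho(((xs, dtor(xs)), (ys, dtor(ys))))


shuffle = corec_top(s)


def const(k):
    return Stream(k, lambda: const(k))


def take(n, xs):
    out = []
    for _ in range(n):
        out.append(xs.head)
        xs = xs.tail
    return out


print(take(5, shuffle((const(1), const(1)))))  # [1, 2, 4, 8, 16]
```

Because `rho` never analyzes its content arguments, it can be applied to values of any type, for instance strings. That is the informal content of its parametricity.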


3.7 Coinduction Up-to 

In a proof assistant, specification mechanisms are not very useful 
unless they are complemented by suitable reasoning infrastructure. 
The natural counterpart of corecursion up-to is coinduction up-to. 
In our incremental framework, the expressiveness of coinduction up-to grows together with that of corecursion up-to.

We start with structural coinduction [?], which allows us to prove two elements of J equal by exhibiting an F-bisimulation, i.e., a binary relation R on J such that whenever two elements $j_1$ and $j_2$ are related, their dtor-unfoldings are componentwise related by R:

$$\frac{R\;j_1\;j_2 \qquad \forall j_1\,j_2 \in J.\ R\;j_1\;j_2 \to \mathsf{Frel}\;R\;(\mathsf{dtor}\;j_1)\;(\mathsf{dtor}\;j_2)}{j_1 = j_2}$$

Recall that our type constructors are not only functors but also relators. The notion of "componentwise relationship" refers to F's relator structure Frel.

Upon integrating a new operation g (Section 3.6), the coinduction rule is made more flexible by allowing the dtor-unfoldings to be componentwise related not only by R but more generally by a closure of R that takes g into account.

For a corecursion state $(K, f, \Lambda)$ and a relation $R : J \to J \to \mathsf{Bool}$, we define $\mathsf{cl}_f\,R$, the f-congruence closure of R, as the smallest equivalence relation that includes R and is compatible with each $f_i : K_i\,J \to J$:

$$\forall z_1, z_2 \in K_i\,J.\ \mathsf{Krel}_i\,(\mathsf{cl}_f\,R)\;z_1\;z_2 \to \mathsf{cl}_f\,R\;(f_i\;z_1)\;(f_i\;z_2)$$

where $\mathsf{Krel}_i$ is the relator associated with $K_i$.

The next theorem supplies the reasoning counterpart of the definition principle stated in Theorem 6. It can be inferred from recent, more abstract results [?].

Theorem 10. The following coinduction rule up to f holds in the corecursion state $(K, f, \Lambda)$:

$$\frac{R\;j_1\;j_2 \qquad \forall j_1\,j_2 \in J.\ R\;j_1\;j_2 \to \mathsf{Frel}\,(\mathsf{cl}_f\,R)\;(\mathsf{dtor}\;j_1)\;(\mathsf{dtor}\;j_2)}{j_1 = j_2}$$

Coinduction up to f is the ideal abstraction for proving equalities involving functions defined by corecursion up to f: For example, a proof of commutativity for $\otimes$ naturally relies on contexts involving $\oplus$, because $\otimes$'s corecursive behavior (i.e., $\otimes$'s dtor-unfolding) depends on $\oplus$.

4. Mixed Fixpoints 

When we write fixpoint equations to define a function f, we often want to distinguish corecursive calls from calls that are sound for other reasons, for example, because they terminate. We model this situation abstractly by a function $s : A \to \Sigma^*\,(F\,(\Sigma^* A) + \Sigma^* A)$. As usual, for each a, the shape of $s\;a$ represents the calling context for $f\;a$, with the occurrences of the content items $a'$ in $s\;a$ representing calls to $f\;a'$. The new twist is that we now distinguish guarded calls (captured by the left-hand side of +) from possibly unguarded ones (the right-hand side of +).

We want to define a function f with the behavior indicated by s, i.e., making the diagram in Figure 4b commute. In the figure, + denotes the map function $u + v : B + C \to D + E$ built from two functions $u : B \to D$ and $v : C \to E$. In the absence of pervasive guards, we cannot employ the corecursors directly to define f. However, if we can show that the noncorecursive calls eventually lead to a corecursive call, we will be able to employ corecFlex. This precondition can be expressed in terms of a fixpoint equation. According to Figure 4a, the call to g (shown on the base arrow) happens only on the right-hand side of +, meaning that the intended corecursive calls are ignored when "computing" the fixpoint g. Our goal is to show that the remaining calls behave properly.

The functions reduce and eval that complete the diagrams of Figure 4 are the expected ones:

• The elements of $\Sigma^*\,(F\,(\Sigma^* A))$ are formal-expression trees guarded on every path to the leaves, and so are the elements of $\Sigma^*\,(F\,(\Sigma^* A) + \Sigma^*\,(\Sigma^*\,(F\,(\Sigma^* A))))$, but with a more restricted shape; reduce embeds the latter in the former: $\mathsf{reduce} = \mathsf{flat} \circ \Sigma^*\,[\mathsf{leaf}, \mathsf{flat}]$, where $\mathsf{flat} : \prod_{A \in \mathsf{Set}} \Sigma^*\,(\Sigma^* A) \to \Sigma^* A$ is the standard join operation of the $\Sigma^*$-monad.

• $\mathsf{eval}_{(F\,\_\,+\,\_)}$ evaluates all the formal operations of $\Sigma^*$: $\mathsf{eval}_{(F\,\_\,+\,\_)} = \mathsf{eval} \circ \Sigma^*\,[\mathsf{ctor} \circ F\,\mathsf{eval},\ \mathsf{eval}]$.

Theorem 11. If there exists (a unique) $g : A \to \Sigma^*\,(F\,(\Sigma^* A))$ such that the diagram in Figure 4a commutes, there exists (a unique) $f : A \to J$ such that the diagram in Figure 4b commutes, namely, corecFlex g.

The theorem certifies the following procedure for making sense of a mixed fixpoint definition of a function f:

1. Separate the guarded and the unguarded calls (as shown in the codomain $\Sigma^*\,(F\,(\Sigma^* A) + \Sigma^* A)$ of s).

2. Prove that the unguarded calls eventually terminate or lead to guarded calls (as witnessed by g).

3. Pass the unfolded guarded calls to the corecursor, i.e., take $f = \mathsf{corecFlex}\;g$.

Example 12. The above procedure can be applied to define facC, $\mathsf{primes} : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Stream}$, and $\mathsf{cat} : \mathsf{Nat} \to \mathsf{Stream}$, while avoiding the unsound nasty (Section 2.3). A simple analysis reveals that the first self-call to primes is guarded while the second is not. We define $g : \mathsf{Nat} \times \mathsf{Nat} \to \Sigma^*\,(\mathsf{Nat} \times \Sigma^*\,(\mathsf{Nat} \times \mathsf{Nat}))$ by

$$\begin{aligned}
g\,(m, n) ={}& \mathsf{if}\ (m = 0 \land n > 1) \lor \gcd\;m\;n = 1 \\
& \mathsf{then}\ \mathsf{leaf}\,(n, \mathsf{leaf}\,(m \times n, n + 1)) \\
& \mathsf{else}\ g\,(m, n + 1)
\end{aligned}$$

In essence, g behaves like (the intended) f except that the guarded calls are left symbolic, whereas the unguarded calls are interpreted as actual calls to g. One can show that g is well defined by a standard termination argument. This characteristic equation of g is the commutativity of the diagram determined by s as in Figure 4a, where $s : \mathsf{Nat} \times \mathsf{Nat} \to \Sigma^*\,(\mathsf{Nat} \times \Sigma^*\,(\mathsf{Nat} \times \mathsf{Nat}) + \Sigma^*\,(\mathsf{Nat} \times \mathsf{Nat}))$ is defined as follows (with Inl and Inr being the left and right sum embeddings):

$$\begin{aligned}
s\,(m, n) ={}& \mathsf{if}\ (m = 0 \land n > 1) \lor \gcd\;m\;n = 1 \\
& \mathsf{then}\ \mathsf{leaf}\,(\mathsf{Inl}\,(n, \mathsf{leaf}\,(m \times n, n + 1))) \\
& \mathsf{else}\ \mathsf{leaf}\,(\mathsf{Inr}\,(\mathsf{leaf}\,(m, n + 1)))
\end{aligned}$$

Setting primes = corecFlex g yields the desired characteristic equation for primes after simplification (cf. Example 5).

The primes example has all unguarded calls in tail form, which makes the associated function g tail-recursive. This need not be the case, as shown by the cat example, whose unguarded calls occur under the well-behaved operation $\oplus$. However, we do require that the unguarded calls occur in contexts formed by well-behaved operations alone. After unfolding all the unguarded calls, the resulting context that is to be handled corecursively must be well behaved; this precludes unsound definitions like nasty.
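The three-step procedure can be illustrated on primes with a short Python sketch. Since every tree in this example is a single leaf, the sketch flattens $g$'s result to a plain pair (head, next argument) and the final corecursion to a simple unfolding; this simplification, as well as the names `Stream`, `g`, `primes`, and `take`, are assumptions made here for illustration.

```python
import math


class Stream:
    """Infinite stream: a head plus a lazily computed, memoized tail."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._thunk = tail_thunk
        self._tail = None

    @property
    def tail(self):
        if self._tail is None:
            self._tail = self._thunk()
        return self._tail


def g(m, n):
    """Unguarded calls become real recursive calls (which terminate);
    the guarded call is left symbolic as the pair (m * n, n + 1)."""
    if (m == 0 and n > 1) or math.gcd(m, n) == 1:
        return (n, (m * n, n + 1))
    return g(m, n + 1)


def primes(m, n):
    """Corecursion over g: emit the head, delay the guarded call."""
    head, (m_next, n_next) = g(m, n)
    return Stream(head, lambda: primes(m_next, n_next))


def take(n, xs):
    out = []
    for _ in range(n):
        out.append(xs.head)
        xs = xs.tail
    return out


print(take(8, primes(1, 2)))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

The split is visible in the code: `g` contains only the recursion that must be justified by a termination argument, while `primes` contains only the guarded corecursive call.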

5. Formalization and Implementation 

We formalized in Isabelle/HOL the metatheory of Sections 3 and 4. Essentially, this means that the results have been proved in higher-order logic with Infinity, Choice, and a mechanism for defining types by exhibiting non-empty subsets of existing types. The logic is comparable to Zermelo set theory with Choice (ZC) but weaker than ZFC. The development would work for any class of functors that are relators (or closed under weak pullbacks), contain basic functors (identity, (co)products, etc.), are closed under intersection and composition, and have initial algebras and final coalgebras that can be represented in higher-order logic. However, our Isabelle development focuses on a specific class [49].

[Figure 4: Mixed fixpoint. Panel (a), assumption: $g = \mathsf{reduce} \circ \Sigma^*\,(\mathsf{id} + \Sigma^* g) \circ s$, with $g : A \to \Sigma^*\,(F\,(\Sigma^* A))$. Panel (b), conclusion: $f = \mathsf{eval}_{(F\,\_\,+\,\_)} \circ \Sigma^*\,(F\,(\Sigma^* f) + \Sigma^* f) \circ s$, with $f = \mathsf{corecFlex}\;g$.]


The formalization consists of two parts: The base derives a corecursor up-to from a primitive corecursor; the step starts with a corecursor up-to and integrates an additional well-behaved operation.

The base part starts by axiomatizing a functor F and defines a codatatype with nesting through F: codatatype J = ctor (F J). (In general, J could depend on type variables, but this is an orthogonal concern that would only clutter the formalization.) Then the formalization defines the free algebra over F and the basic corecursor seed $\Lambda$ for initializing the state with ctor as well behaved (Section 3.5). It also needs to lift $\Lambda$ to the free algebra, a technicality that was omitted in the presentation. Then it defines eval and other necessary structure (Section 3.3). Finally, it introduces corecTop and corecFlex (Section 3.4) and derives the corresponding coinduction principle (Section 3.7).

From a high-level point of view, the step part has a somewhat similar structure to the base. It axiomatizes a domain functor K and a parametric function $\rho$ associated with the new well-behaved operation g to integrate. Then it extends the signature to include K, defines the extended corecursor seed $\Lambda'$, and lifts $\Lambda'$ to the free algebra. Next, it defines the parameterized $\mathsf{eval}_g$ and other infrastructure (Section 3.6). Finally, it introduces corecTop and corecFlex for the new state and derives the coinduction principle.

The process of instantiating the metatheory to particular user-specified codatatypes is automated by a prototype tool: the user points to a particular codatatype (typically defined using Isabelle's existing (co)datatype specification language [?]), and then the tool takes over and instantiates the generic corecursor to the indicated type, providing the concrete corecursion and mixed recursion-corecursion theorems. The stream and tree examples presented in Section 2 have all been obtained with this tool. As a larger case study, we formalized all the examples from the extended version of Hinze and James's study [23]. The parametricity proof obligations were discharged by Isabelle's parametricity prover [?]. The mixed recursion-corecursion definitions were done using Isabelle's facility for defining terminating recursive functions [29].

Unlike Isabelle's primitive (co)recursion mechanism [?], our tool currently lacks syntactic sugar support, so it still requires some boilerplate from the user, namely the explicit invocation of the corecursor and the parametricity prover. These are just a few extra lines of script per definition, so the tool is already usable in its current form. Following the design of its primitive ancestor, its envisioned fully user-friendly extension will replace the explicit invocation of the corecursor with a corec command, allowing users to specify a function f corecursively and then performing the following steps (cf. Example 5):

1. Parse the specification of f and synthesize arguments to the 
current, most powerful corecursor. 

2. Define f in terms of the corecursor. 

3. Derive the original specification from the corecursor theorems. 

Passing the well_behaved option to corec will additionally invoke
the following procedure (cf. Example 9):

4. Extract a polymorphic function $\rho$ from the specification of f.

5. Automatically prove $\rho$ parametric or pass the proof obligation to the user.

6. Derive the new strengthened corecursor and its new coinduction principle.

The corec command will be complemented by an additional command, tentatively called well_behaved_for_corec, for registering arbitrary operations f (not necessarily defined using corec) as well behaved. The command will ask the user to provide a corecursive specification of f as a lemma of the form f x = Cons ... and then perform steps 4 to 6. The corec command will become stronger and stronger as more well-behaved operations are registered.

The following Isabelle theory fragment gives a flavor of the 
envisioned functionality from the user’s point of view: 

codatatype Stream A = SCons (head: A) (tail: Stream A)

corec (well_behaved) ⊕ : Stream → Stream → Stream
  xs ⊕ ys = SCons (head xs + head ys) (tail xs ⊕ tail ys)

corec (well_behaved) ⊗ : Stream → Stream → Stream
  xs ⊗ ys = SCons (head xs × head ys)
    ((xs ⊗ tail ys) ⊕ (tail xs ⊗ ys))

lemma ⊕_commute: xs ⊕ ys = ys ⊕ xs
  by (coinduction arbitrary: xs ys rule: stream.coinduct) auto

lemma ⊗_commute: xs ⊗ ys = ys ⊗ xs
proof (coinduction arbitrary: xs ys rule: stream.coinduct_upto)
  case Eq_stream
  thus ?case unfolding tail_⊗
    by (subst ⊕_commute) (auto intro: stream.cl_⊕)
qed

6. Related Work 

There is a lot of relevant work, concerning both the metatheory and 
applications in proof assistants and similar systems. We referenced 
some of the most closely related work in the earlier sections. Here 
is an attempt at a more systematic overview. 

Category Theory. The notions of corecursion and coinduction 
up-to started with process algebra [44, 47] before they were 
recast in the abstract language of category theory iiiEaiiiiiiiiEii 
14^1501 . Our approach owes a lot to this theoretical work, and 
indeed formalizes some state-of-the-art category-theoretical results 
on corecursion and coinduction up-to [38, 43]. Besides adapting 
existing results to higher-order logic within an incremental 
corecursor cycle, we have also extended the state of the art with a 
sound mechanism for mixing recursion with corecursion up-to. 

Category theory provides an impressive body of abstract results 
that can be applied to solve concrete problems elegantly. Proof 
assistants have much to gain from category theory, as we hope 
to have demonstrated with this paper. There has been prior work 
on integrating coinduction up-to techniques from category theory 
into these tools. Hensel and Jacobs [20] illustrated the categorical 
approach to (co)datatypes in PVS via axiomatic declarations 
of various flavors of trees with (co)recursors and proof principles. 
Popescu and Gunter proposed incremental coinduction for a deeply 
embedded proof system in Isabelle/HOL BOl . Hur et al. dll 
extended Winskel’s m and Moss’s [39] parameterized coinduction 
and studied applications to Agda, Coq, and Isabelle/HOL. Endrullis 
et al. OH developed a method to perform up-to coinduction in Coq, 
adapting insight from behavioral logic EH. To our knowledge, no 
prior work has realized corecursion up-to in a proof assistant. 

Ordered Structures and Convergence. A number of approaches 
to define functions on infinite types are based on domain theory, or 
more generally on ordered structures and notions of convergence, 
including Matthews EU, Di Gianantonio and Miculan (m, 
Huffman dll, and Lochbihler and Hölzl [36]. These are not directly 
comparable to our work because they do not guarantee productivity 
or otherwise offer total programming. They also force the user 
to switch to a different, richer universe of domains or to define 
ordered structures and perform continuity proofs (although Matthews 
shows that this process can be partly automated). 

Strictly speaking, our approach does not guarantee productivity 
either. This is an inherent limitation of the semantic (shallow 
embedded) approach in HOL systems, which do not specify a 
computational model (unlike Agda and Coq). Productivity can be argued 
informally by inspecting the characteristic corecursion equations. 

Syntactic Criteria. Proof assistants based on type theory include 
checkers for termination of recursive functions and productivity 
of corecursive functions. These checkers are part of the system’s 
trusted code base; bugs can lead to inconsistencies, as we saw for 
Agda [48] and Coq [16].* For users, such syntactic criteria are also 
inflexible; for example, Coq allows more than one constructor to 
appear as guards but is otherwise limited to primitive corecursion. 

To the best of our knowledge, the only deployed system that 
explicitly supports mixed recursive-corecursive definitions is Dafny. 
Leino and Moskal’s paper EH triggered our interest in the topic. 
Unfortunately, the paper is not entirely clear about the supported 
fragment. A naive reading suggests that the inconsistent nasty 
example from Section 2.3 is allowed, as was the case with earlier 
versions of Dafny. Newer versions reject not only nasty but also 
the legitimate cat function from the same subsection. 

Type Systems. A more flexible alternative to syntactic criteria 
is to have users annotate the functions’ types with information 
that controls termination and productivity. Approaches in this 
category include fair reactive programming [14, 18, 30], clock 
variables GlIISl, and sized types El. Sized types are implemented in 
MiniAgda 01 and in newer versions of Agda, in conjunction with 
a destructor-oriented (copattern) syntax for corecursion p5(. These 
approaches, often featuring a blend of type systems and notions of 
convergence, achieve higher modularity and trustworthiness by 
moving away from purely syntactic criteria and toward semantic 
properties. By carefully tracking sizes and timers, they allow for 
more general contexts than our well-behavedness criterion. Our 
approach captures a 1-1 contract: A well-behaved function can 
destroy one constructor to produce one. A function f that would 
map the stream a₁, a₂, ... to a₁, a₁, a₂, a₂, ... would have a 
1-2 contract. And a function g mapping a₁, a₂, a₃, a₄, ... to 
a₁ + a₂, a₃ + a₄, ... would require a 2-1 contract. The composition 
g ∘ f would yield a 1-1 contract and could in principle appear in a 
corecursive call context, but our framework does not allow it. 
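
The contracts discussed above can be made concrete with a small Python sketch. Generators stand in for streams here, which is only an approximation of the HOL setting; the names stutter, pair_sums, and counted are hypothetical and serve only to show how many input elements each function consumes per output element.

```python
from itertools import count, islice


def stutter(xs):
    # f: a1, a2, ... -> a1, a1, a2, a2, ...
    # Consumes one element to produce two (a 1-2 contract).
    for a in xs:
        yield a
        yield a


def pair_sums(xs):
    # g: a1, a2, a3, a4, ... -> a1 + a2, a3 + a4, ...
    # Consumes two elements to produce one (a 2-1 contract).
    it = iter(xs)
    for a, b in zip(it, it):
        yield a + b


def counted(xs, counter):
    # Pass elements through unchanged, counting how many were pulled.
    for a in xs:
        counter[0] += 1
        yield a


consumed = [0]
composed = pair_sums(stutter(counted(count(1), consumed)))
prefix = list(islice(composed, 5))
inputs_used = consumed[0]
```

On the input 1, 2, 3, ... the composition yields 2, 4, 6, 8, 10 after pulling exactly five input elements, so it behaves like a 1-1 function even though neither component does on its own.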


* In all fairness, we should mention that critical bugs were also found in the 
primitive definitional mechanism of our proof assistant of choice, Isabelle 
ED. Our point is not that brand B is superior to brand A, but rather that it 
is generally desirable to minimize the amount of trusted code. 


Clock variables and sized types require an extension to the type 
system and burden the types. These general contracts must be 
specified by the user and complicate the up-to corecursion principle; 
the arithmetic that ensures that contracts fit together would have to 
be captured in the principle, giving rise to new proof obligations. In 
contrast, well-behaved functions can be freely combined. This is 
the main reason why we can claim it is a “sweet spot.” 

There is a prospect of embedding our lighter approach into such 
heavier but more precise frameworks. Our well-behaved operators 
possibly form the maximal class of context functions requiring no 
annotations (in general), amounting to a lightweight subsystem of 
Krishnaswami and Benton’s type system [30]. 

7. Conclusion 

We presented a formalized framework for deriving rich corecursors 
that can be used to define total functions producing codatatypes. 
The corecursors gain in expressiveness with each new corecursive 
function definition that satisfies a semantic criterion. They 
constitute a significant improvement over the state of the art in the 
world of proof assistants based on higher-order logic, including HOL4, 
HOL Light, Isabelle/HOL, and PVS. Trustworthiness is attained at 
the cost of elaborate constructions. Coinduction being somewhat 
counterintuitive, we argue that these safeguards are well worth the 
effort. As future work, we want to transform our prototype tool into 
a solid implementation inside Isabelle/HOL. 

Although we emphasized the foundational nature of the framework, 
many of the ideas equally apply to systems with built-in 
codatatypes and corecursion. One could imagine extending the 
productivity check of Coq to allow corecursion under well-behaved 
operations, linking a syntactic criterion to a semantic property, as 
a lightweight alternative to clock variables and sized types. The 
emerging infrastructure for parametricity in Coq [11, 27] would 
likely be a useful building block. 